Proximity-based engagement with digital assistants

ABSTRACT

A proximity of a first computing device with a second computing device can be detected. In response to the detecting of the proximity, it can be determined that a computer-readable qualification for a type of proactive engagement is met using the detected proximity and possibly a non-proximity state of the first device. The type of proactive engagement can be matched in the computer system with an operation that is programmed to be performed using a computerized natural language digital assistant. In response to the determining that the qualification for the type of proactive engagement is met, the technique can include performing a user engagement action via the second device. The user engagement action can be of a type indicated by the type of proactive engagement. The user engagement action can facilitate a computerized communication session between a computer-readable user profile and the computerized natural language digital assistant.

BACKGROUND

A digital assistant is a computer system component (such as a combination of software and the hardware that the software runs on) that receives natural language input and responds with natural language responses in a dialog. Digital assistants typically can also perform non-dialog actions in response to natural language input commands, such as starting a computer application in response to such a command or providing Web search results in response to such a command.

Currently, when a user engages a digital assistant for a session between the digital assistant and that user's computer-readable profile, the user typically says a key phrase, such as an audible speech phrase that is picked up by a microphone in a device in which the digital assistant is active, or provides a tactile input to a device. Other approaches such as attention-based engagement models have been proposed that build models of where people are attending and apply them to predict engagement. In other applications such as commerce or security, proximity-based services have been deployed.

SUMMARY

The tools and techniques discussed herein can provide usability and efficiency improvements in computer systems that utilize digital assistants. Specifically, the tools and techniques can provide more efficient and user-friendly engagement with a digital assistant, by using detected proximity between multiple devices as well as the state of at least one of the devices to determine that a qualification for a type of proactive engagement is met. In response to this determination, the computer system can trigger a corresponding engagement action by the computer system. For example, the engagement action can be tailored to the type of proactive engagement, which can be one of multiple options for proactive engagement in the computer system. Such tools and techniques can provide a more efficient and user-friendly engagement with a digital assistant.

In one aspect, the tools and techniques can include detecting a proximity of a first device with a second device. The first and second devices can each be a computerized device in a computer system, each having a user interface component. In response to the detecting of the proximity, it can be determined via the computer system that a computer-readable qualification for a type of proactive engagement is met, with the determination using the detected proximity and a non-proximity state of the first device (such as a state of performing a task on the first device). The type of proactive engagement can be one of multiple available computer-readable options for proactive engagements in the computer system. In response to the determining that the qualification for the type of proactive engagement is met, the technique can include generating a user engagement user interface output of a type indicated by the type of proactive engagement. In response to the generating of the output, the technique can include presenting the generated output on the second device via the computerized natural language digital assistant operating in the computer system. The presenting of the generated output can initiate a session with the digital assistant in the computer system.

In another aspect of the tools and techniques, a proximity of a first computing device with a second computing device can be detected. The first and second computing devices can each be a computerized device in the computer system, with each device having a user interface component. In response to the detecting of the proximity, it can be determined that a computer-readable qualification for a type of proactive engagement is met using the detected proximity and a non-proximity state of the first device. The type of proactive engagement can be matched in the computer system with an operation that is programmed to be performed using a computerized natural language digital assistant. In response to the determining that the qualification for the type of proactive engagement is met, the technique can include performing a user engagement action via the second device. The user engagement action can be of a type indicated by the type of proactive engagement. The user engagement action can facilitate a computerized communication session between a computer-readable user profile and the computerized natural language digital assistant in the computer system.

In yet another aspect of the tools and techniques, a proximity of a first computing device with a second computing device in a computer system can be detected. The first and second devices can each have a user interface component. In response to the detecting of the proximity, it can be determined that a computer-readable qualification for a qualified type of proactive engagement is met using the detected proximity. The qualified type of proactive engagement can be one of multiple available computer-readable options for proactive engagements that are matched in the computer system with operations that are programmed to be performed using a computerized natural language digital assistant. The determining that the qualification for the qualified type of proactive engagement is met can include selecting between at least the following types of proactive engagements: a first type of proactive engagement for a first action that is a current user engagement user interface output via the second device, with the first action initiating a computerized communication session between a computer-readable user profile and the computerized natural language digital assistant in the computer system via the second device; and a second type of proactive engagement for a second action that is a preparatory action, which prepares the computer system for the session in anticipation of a subsequent user input action indicating an intent to engage in the session. Also, in response to the determining that the qualification for the qualified type of proactive engagement is met, a user engagement action can be performed via the second device, with the user engagement action being of a type indicated by the qualified type of proactive engagement, and with the user engagement action facilitating the session.

This Summary is provided to introduce a selection of concepts in a simplified form. The concepts are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter. Similarly, the invention is not limited to implementations that address the particular techniques, tools, environments, disadvantages, or advantages discussed in the Background, the Detailed Description, or the attached drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a suitable computing environment in which one or more of the described aspects may be implemented.

FIG. 2 is a schematic diagram of a digital assistant engagement system.

FIG. 3 is another schematic diagram of a digital assistant engagement system.

FIG. 4 is a diagram of a chart illustrating an example of a relationship between distance between devices and Bluetooth® received signal strength indication.

FIG. 5 is a flowchart of a proximity-based digital assistant engagement technique.

FIG. 6 is a flowchart of another proximity-based digital assistant engagement technique.

FIG. 7 is a flowchart of yet another proximity-based digital assistant engagement technique.

DETAILED DESCRIPTION

Aspects described herein are directed to techniques and tools for improved engagement between a user profile and a digital assistant. For example, this can be done by using detected proximity between multiple devices, as well as the state of at least one of the devices to trigger an appropriate and tailored engagement action by the computer system to facilitate a session with the digital assistant. Such engagement tools and techniques may be utilized by the computer system in response to a user merely bringing one device into proximity with another device, where at least one of the devices is running the digital assistant.

Such techniques and tools may include enabling users to initiate and maintain engagement with digital assistants, with the user only needing to bring one device into proximity with another device, where one of the devices is running the digital assistant. Accordingly, the techniques and tools can include: device-based engagement (implicit triggers) with digital assistants; the system detecting whether offering support is possible and appropriate; and basing the support on what is known or can be computed about the current task context.

One aspect of the tools and techniques can include detecting an implicit engagement event. The event can include the juxtaposition of a user's mobile device (smartphone, tablet, laptop, etc.) and a stationary device or other mobile device, where one of the devices is running a digital assistant (such as a stationary device that is an intelligent speaker device) as a trigger for engagement between the user's corresponding user profile and the digital assistant.

Another aspect of the tools and techniques can include using the inferred distance (such as an estimated exact distance, or a grouped distance or range, e.g., immediate/near/far), in some cases along with device motion before being placed next to the stationary device, as factors in a technique to compute a score representing the likelihood of an implicit engagement event occurring (such as the likelihood of an intent to engage with the digital assistant to receive a type of assistance the digital assistant is configured to provide). Such a score can be used to determine whether the qualification for the type of proactive engagement is met. For example, the qualification to be met can be embodied in one or more computer-readable data structures used for testing qualification, such as learning models or qualification rules.
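As a non-limiting illustration of the scoring discussed above, the following minimal sketch combines a grouped distance range with a motion indicator and compares the result against a threshold. The weights, the motion bonus, the threshold, and all names (RANGE_WEIGHTS, ProximityObservation, engagement_score, qualification_met) are hypothetical and are chosen only for illustration; a learning model could be used in place of this simple rule.

```python
from dataclasses import dataclass

# Hypothetical weights per grouped distance range; the values are illustrative only.
RANGE_WEIGHTS = {"immediate": 1.0, "near": 0.6, "far": 0.1}


@dataclass
class ProximityObservation:
    distance_range: str           # "immediate", "near", or "far"
    moved_before_placement: bool  # device was in motion before being set down


def engagement_score(obs: ProximityObservation, motion_bonus: float = 0.2) -> float:
    """Return a score in [0, 1] estimating the likelihood of an implicit engagement event."""
    score = RANGE_WEIGHTS[obs.distance_range]
    if obs.moved_before_placement:
        # Motion just before placement is treated as extra evidence of intent.
        score += motion_bonus
    return min(score, 1.0)


def qualification_met(obs: ProximityObservation, threshold: float = 0.7) -> bool:
    """Simple qualification rule: the score must reach an assumed threshold."""
    return engagement_score(obs) >= threshold
```

Under these assumed values, a device that was moved and then placed in the "near" range would qualify, while a device in the "far" range would not, regardless of motion.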

Another aspect of the tools and techniques can include, upon detection of this implicit engagement event, having the digital assistant proactively reach out to users with a user output action such as a request to engage (e.g., via voice with a statement such as “Can I help you with that?”), or a different notification strategy, e.g., LED glow. Beyond voice requests, the device may also start using the screen on the mobile device to present content such as the current task state on the stationary device, such as information about a current song playing on the stationary device.

In another aspect, the tools and techniques can include the system monitoring the current task (e.g., application that is running or web page being viewed) on the user's mobile device and using the information about that task to adapt the proactive engagement action to the current task (e.g., by outputting the sentence “Can I help you cook that?” from one device while another device is displaying a recipe).

Yet another aspect of the tools and techniques can include using what is being displayed on one of the devices, such as a mobile device, as a clue for whether there is an intent to engage with the digital assistant using the other device, such as an intelligent speaker device, and/or whether the agent can help. For example, the system can determine whether the current task is conducive to support (like recipes, if the digital assistant is programmed to help in cooking with recipes) and whether the task is something that the digital assistant can help with (i.e., is it in the digital assistant's skillset?). This can be considered in determining whether a qualification for a corresponding type of engagement is met.
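The task-based check described above can be sketched as follows, under stated assumptions. The skill names, the prompt table, and the function select_prompt are hypothetical; only the two prompt strings are taken from the examples in this description. The sketch returns no prompt when the devices are not in proximity or when the current task is outside the assumed skillset.

```python
from typing import Optional

# Hypothetical skillset: task types the digital assistant is programmed to support.
ASSISTANT_SKILLS = {"recipe", "music"}

# Hypothetical task-specific prompts; the recipe prompt is the example used in the text.
TAILORED_PROMPTS = {"recipe": "Can I help you cook that?"}
GENERIC_PROMPT = "Can I help you with that?"


def select_prompt(in_proximity: bool, current_task: Optional[str]) -> Optional[str]:
    """Return a proactive prompt, or None if the qualification is not met."""
    # No engagement without proximity, without a task, or for an unsupported task.
    if not in_proximity or current_task not in ASSISTANT_SKILLS:
        return None
    # Prefer a prompt tailored to the task; fall back to the generic request.
    return TAILORED_PROMPTS.get(current_task, GENERIC_PROMPT)
```

Returning None for unsupported tasks reflects the goal of avoiding proactive outputs when the digital assistant cannot help with the task at hand.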

Similarly, in another aspect, the tools and techniques can include using an action or task currently being performed on one of the devices (e.g., the stationary device) to understand whether engagement is likely to be intended at the current time (and thus whether the qualification for a type of engagement is met).

In yet another aspect, the engagement event can be used as a trigger to queue up relevant information for the current task in anticipation of an explicit user request, e.g., finding food preparation videos to accompany a recipe in anticipation of such videos being useful to answer natural language requests posed to the digital assistant. Accordingly, such queuing of information can be at least part of the engagement action that can facilitate a session with the digital assistant.
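One way such a preparatory action might be structured is sketched below, assuming a caller-supplied fetch function that retrieves content for a task. The class name PreparedContext, its methods, and the fetch callable are all hypothetical. The engagement event queues up content once, and a later explicit request is served from the queued content when it is available.

```python
from typing import Callable, Dict, List


class PreparedContext:
    """Queues up task-related content when an engagement event is detected."""

    def __init__(self, fetch: Callable[[str], List[str]]):
        self._fetch = fetch                    # hypothetical content lookup
        self._cache: Dict[str, List[str]] = {}

    def on_engagement_event(self, task_key: str) -> None:
        # Preparatory action: fetch ahead of any explicit user request.
        if task_key not in self._cache:
            self._cache[task_key] = self._fetch(task_key)

    def on_user_request(self, task_key: str) -> List[str]:
        # Serve queued content if present; otherwise fetch on demand.
        if task_key in self._cache:
            return self._cache[task_key]
        return self._fetch(task_key)
```

In this sketch, a request that follows an engagement event for the same task does not trigger a second fetch, which is the intended benefit of preparing in advance.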

In yet another aspect, engagement intentions and preferences for a user profile can be learned over time based on previous engagements using machine learning techniques, and the system can facilitate the setting of explicit preferences in response to user input.

Using these aspects of the tools and techniques along with other details discussed herein, an improved computer system with improved digital assistant engagement capability can be provided. For example, using the engagement techniques above, a computer system can determine when to proactively engage the user profile in a session with a digital assistant, with the user's simple action of bringing one device near another device. This can improve the usability of the digital assistant and can also decrease usage of resources such as processing capacity, memory, and network bandwidth by avoiding the need to process an initial initiating natural language command by the user, which is likely followed by another command requesting a particular type of skill or task from the digital assistant. The proactive engagement can also increase the speed in which the system can provide a desired task or skill by the digital assistant, again by avoiding the need to process an initial initiating natural language command by the user, which is likely followed by another command requesting a particular type of skill or task from the digital assistant.

Additionally, by using proximity and possibly other non-proximity state information about the devices to determine whether to perform a proactive engagement action, the system can decrease the occurrence of false positive engagements, where a proactive engagement occurs at an inopportune time. For example, proactively engaging users without considering such factors may lead to a high rate of proactive user outputs when the system is not even technically capable of assisting with a desired task. For example, this may occur when an active task on one of the devices is a task for which the digital assistant is not programmed to help. As another example, this could occur when no task is being performed on either device, or it is not a time when a task would typically be performed on either device. In such situations, a proactive engagement by the computer system can waste computer resources such as processing capability and network bandwidth, in addition to potentially annoying a user and decreasing usability of the system. Such problems can be reduced by accounting for the proximity of the devices in determining whether to trigger a proactive engagement, and such problems can be further reduced by considering non-proximity factors such as state of at least one of the devices (e.g., a task being performed on the device at the time the proximity occurs) when determining whether a qualification for a type of proactive engagement is met, based on device proximity. Accordingly, a system implementing such features can increase its usability and its efficiency compared to systems without such features.

As used herein, a user profile is a set of data that represents an entity such as a user, a group of users, a computing resource, etc. When references are made herein to a user profile performing actions (sending, receiving, etc.), those actions are considered to be performed by a user profile if they are performed by computer components in an environment where the user profile is active (such as where the user profile is logged into an environment and that environment controls the performance of the actions). Often such actions by or for a user profile are also performed by or for a user corresponding to the user profile. For example, this may be the case where a user profile is logged in and active in a computer application and/or a computing device that is performing actions for the user profile on behalf of a corresponding user. To provide some specific examples, this usage of terminology related to user profiles applies with references to a user profile providing user input, receiving responses, or otherwise interacting with computer components discussed herein (e.g., engaging in a session or a dialog between a digital assistant and a user profile).

The subject matter defined in the appended claims is not necessarily limited to the benefits described herein. A particular implementation of the invention may provide all, some, or none of the benefits described herein. Although operations for the various techniques are described herein in a particular, sequential order for the sake of presentation, this manner of description encompasses rearrangements in the order of operations, unless a particular ordering is required. For example, operations described sequentially may in some cases be rearranged or performed concurrently. Moreover, for the sake of simplicity, flowcharts may not show the various ways in which particular techniques can be used in conjunction with other techniques.

Techniques described herein may be used with one or more of the systems described herein and/or with one or more other systems. For example, the various procedures described herein may be implemented with hardware or software, or a combination of both. For example, the processor, memory, storage, output device(s), input device(s), and/or communication connections discussed below with reference to FIG. 1 can each be at least a portion of one or more hardware components. Dedicated hardware logic components can be constructed to implement at least a portion of one or more of the techniques described herein. For example, and without limitation, such hardware logic components may include Field-programmable Gate Arrays (FPGAs), Application-specific Integrated Circuits (ASICs), Application-specific Standard Products (ASSPs), System-on-a-chip systems (SOCs), Complex Programmable Logic Devices (CPLDs), etc. Applications that may include the apparatus and systems of various aspects can broadly include a variety of electronic and computer systems. Techniques may be implemented using two or more specific interconnected hardware modules or devices with related control and data signals that can be communicated between and through the modules, or as portions of an application-specific integrated circuit. Additionally, the techniques described herein may be implemented by software programs executable by a computer system. As an example, implementations can include distributed processing, component/object distributed processing, and parallel processing. Moreover, virtual computer system processing can be constructed to implement one or more of the techniques or functionality, as described herein.

I. Exemplary Computing Environment

FIG. 1 illustrates a generalized example of a suitable computing environment (100) in which one or more of the described aspects may be implemented. For example, one or more such computing environments can be used as a client device, a peer-to-peer device and/or a server device in a computer system that provides features such as a digital assistant, which can run on one or more such devices. Generally, various computing system configurations can be used. Examples of well-known computing system configurations that may be suitable for use with the tools and techniques described herein include, but are not limited to, server farms and server clusters, personal computers, server computers, smart phones, laptop devices, slate devices, game consoles, multiprocessor systems, microprocessor-based systems, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.

The computing environment (100) is not intended to suggest any limitation as to scope of use or functionality of the invention, as the present invention may be implemented in diverse types of computing environments.

With reference to FIG. 1, various illustrated hardware-based computer components will be discussed. As will be discussed, these hardware components may store and/or execute software. The computing environment (100) includes at least one processing unit or processor (110) and memory (120). In FIG. 1, this most basic configuration (130) is included within a dashed line. The processing unit (110) executes computer-executable instructions and may be a real or a virtual processor. In a multi-processing system, multiple processing units execute computer-executable instructions to increase processing power. The memory (120) may be volatile memory (e.g., registers, cache, RAM), non-volatile memory (e.g., ROM, EEPROM, flash memory), or some combination of the two. The memory (120) stores software (180) implementing proximity-based engagement with digital assistants. An implementation of proximity-based engagement with digital assistants may involve all or part of the activities of the processor (110) and memory (120) being embodied in hardware logic as an alternative to or in addition to the software (180).

Although the various blocks of FIG. 1 are shown with lines for the sake of clarity, in reality, delineating various components is not so clear and, metaphorically, the lines of FIG. 1 and the other figures discussed below would more accurately be grey and blurred. For example, one may consider a presentation component such as a display device to be an I/O component (e.g., if the display device includes a touch screen). Also, processors have memory. The inventors hereof recognize that such is the nature of the art and reiterate that the diagram of FIG. 1 is merely illustrative of an exemplary computing device that can be used in connection with one or more aspects of the technology discussed herein. Distinction is not made between such categories as “workstation,” “server,” “laptop,” “handheld device,” etc., as all are contemplated within the scope of FIG. 1 and reference to “computer,” “computing environment,” or “computing device.”

A computing environment (100) may have additional features. In FIG. 1, the computing environment (100) includes storage (140), one or more input devices (150), one or more output devices (160), and one or more communication connections (170). An interconnection mechanism (not shown) such as a bus, controller, or network interconnects the components of the computing environment (100). Typically, operating system software (not shown) provides an operating environment for other software executing in the computing environment (100), and coordinates activities of the components of the computing environment (100).

The memory (120) can include storage (140) (though they are depicted separately in FIG. 1 for convenience), which may be removable or non-removable, and may include computer-readable storage media such as flash drives, magnetic disks, magnetic tapes or cassettes, CD-ROMs, CD-RWs, DVDs, which can be used to store information and which can be accessed within the computing environment (100). The storage (140) stores instructions for the software (180).

The input device(s) (150) may be one or more of various input devices. For example, the input device(s) (150) may include a user device such as a mouse, keyboard, trackball, etc. The input device(s) (150) may implement one or more natural user interface techniques, such as speech recognition, touch and stylus recognition, recognition of gestures in contact with the input device(s) (150) and adjacent to the input device(s) (150), recognition of air gestures, head and eye tracking, voice and speech recognition, sensing user brain activity (e.g., using EEG and related methods), and machine intelligence (e.g., using machine intelligence to understand user intentions and goals). As other examples, the input device(s) (150) may include a scanning device; a network adapter; a CD/DVD reader; or another device that provides input to the computing environment (100). The output device(s) (160) may be a display, printer, speaker, CD/DVD-writer, network adapter, or another device that provides output from the computing environment (100). The input device(s) (150) and output device(s) (160) may be incorporated in a single system or device, such as a touch screen or a virtual reality system.

The communication connection(s) (170) enable communication over a communication medium to another computing entity. Additionally, functionality of the components of the computing environment (100) may be implemented in a single computing machine or in multiple computing machines that are able to communicate over communication connections. Thus, the computing environment (100) may operate in a networked environment using logical connections to one or more remote computing devices, such as a handheld computing device, a personal computer, a server, a router, a network PC, a peer device or another common network node. The communication medium conveys information such as data or computer-executable instructions or requests in a modulated data signal. A modulated data signal is a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media include wired or wireless techniques implemented with an electrical, optical, RF, infrared, acoustic, or other carrier.

The tools and techniques can be described in the general context of computer-readable media, which may be storage media or communication media. Computer-readable storage media are any available storage media that can be accessed within a computing environment, but the term computer-readable storage media does not refer to propagated signals per se. By way of example, and not limitation, with the computing environment (100), computer-readable storage media include memory (120), storage (140), and combinations of the above.

The tools and techniques can be described in the general context of computer-executable instructions, such as those included in program modules, being executed in a computing environment on a target real or virtual processor. Generally, program modules include routines, programs, libraries, objects, classes, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The functionality of the program modules may be combined or split between program modules as desired in various aspects. Computer-executable instructions for program modules may be executed within a local or distributed computing environment. In a distributed computing environment, program modules may be located in both local and remote computer storage media.

For the sake of presentation, the detailed description uses terms like “determine,” “present,” “identify,” “generate,” and “operate” to describe computer operations in a computing environment. These and other similar terms are high-level descriptions for operations performed by a computer, and should not be confused with acts performed by a human being, unless performance of an act by a human being (such as a “user”) is explicitly noted. The actual computer operations corresponding to these terms vary depending on the implementation.

II. System with Proximity-Based Engagement with Digital Assistants

In the discussion of the system with proximity-based engagement with digital assistants, note that communications between the various devices and components discussed herein can be sent using computer system hardware, such as hardware within a single computing device, hardware in multiple computing devices, and/or computer network hardware. A communication or data item may be considered to be sent to a destination by a component if that component passes the communication or data item to the system in a manner that directs the system to route the item or communication to the destination, such as by including an appropriate identifier or address associated with the destination. Also, a data item may be sent in multiple ways, such as by directly sending the item or by sending a notification that includes an address or pointer for use by the receiver to access the data item. In addition, multiple requests may be sent by sending a single request that requests performance of multiple tasks.

A. General Proximity-Based Digital Assistant Engagement System

Referring now to FIG. 2, components of a computerized proximity-based digital assistant engagement system (200) will be discussed. Each of the components includes hardware and may also include software. For example, a component of FIG. 2 can be implemented entirely in computer hardware, such as in a system on a chip configuration. Alternatively, a component can be implemented in computer hardware that is configured according to computer software and running the computer software. The components can be distributed across computing machines or grouped into a single computing machine in various different ways. For example, a single component may be distributed across multiple different computing machines (e.g., with some of the operations of the component being performed on one or more client computing devices and other operations of the component being performed on one or more machines of a server).

Referring to FIG. 2, a digital assistant engagement system (200) is illustrated in a client-server configuration to illustrate an example of a system that can be used for the tools and techniques discussed herein. In the client-server configuration shown, the components of the system (200) can include client devices (210), which can run computer applications. For example, the client devices (210) can include a laptop (212) having a display (214) as an output device and running laptop applications (216). The client devices (210) can also include a smartphone (218) having a display (220) as an output device and running smartphone applications (222). As another example, the client devices (210) may include an intelligent speaker device (224) having a speaker (226) as an output device and running intelligent speaker applications (228). Different types of devices can be used with tools and techniques discussed herein. Also, the devices discussed herein can include additional or different components than discussed above. For example, smartphones and laptops typically include speaker output devices, and such devices may include far-field microphones so that they can operate similarly to dedicated intelligent speaker devices.

The digital assistant engagement system (200) can also include a network (230) through which one or more of the client devices (210) can communicate with a digital assistant service (240), which can work with the client devices (210) to host a natural language digital assistant that can include software running on the hardware of one or more of the client devices (210) and the digital assistant service (240).

The system (200) can also include a content service (270), which can provide content, such as Web pages, video content, images, and/or other content to the client devices (210) and/or the digital assistant service (240). The digital assistant engagement system (200) may also include additional components, such as additional search services, content services, and other services that can be invoked by the digital assistant service (240) and/or the client devices (210).

B. Example of a Proximity-Based Digital Assistant Engagement System

Referring now to FIG. 3, a schematic diagram illustrates a digital assistant engagement system (300). The digital assistant engagement system (300) of FIG. 3 includes some components from the client-server digital assistant engagement system (200) of FIG. 2 and adds additional illustrations of components and communications. The digital assistant engagement system (300) of FIG. 3 may operate in a client-server configuration like FIG. 2, or in some other configuration (e.g., a peer-to-peer configuration).

The digital assistant engagement system (300) of FIG. 3 can include a digital assistant (302) running in the system. The digital assistant (302) can run in a distributed manner. For example, the digital assistant (302) may include components running in one or more client devices like the client devices (210) of FIG. 2 and in a remote digital assistant service such as the digital assistant service (240) of FIG. 2.

The digital assistant (302) can receive natural language input. For example, the natural language input can be received at a client device such as the intelligent speaker device (224) in the form of audible speaking, or in the form of natural language text entered through a keyboard or other text input device, such as through a touch screen of a smartphone (218). As illustrated in FIG. 3, the digital assistant (302) is operating on the intelligent speaker device (224) and on another set of one or more devices, such as a remote digital assistant service.

In the case of audible input to the digital assistant (302), the input can be processed using a speech-to-text component that can be part of an intent understanding component. The speech-to-text component may use one or more existing speech-to-text processes and may even invoke an existing speech-to-text engine by passing the speech to the existing speech-to-text engine and receiving results from the engine. For example, the speech-to-text component may utilize an existing overall process such as a Hidden-Markov Model-based process, a dynamic time warping process, or a neural network process. The speech-to-text component may also use one or more performance improving techniques, such as context dependency; cepstral normalization; vocal tract length normalization; maximum likelihood linear regression; delta and delta-delta coefficients; heteroscedastic linear discriminant analysis (LDA); splicing and an LDA-based projection followed by heteroscedastic linear discriminant analysis or a global semi-tied co-variance transform; discriminative training techniques; and/or other speech-to-text performance enhancing techniques.

Data representing text of natural language instructions (whether received as text or produced by the speech-to-text component) can be provided to a language understanding component, which can be part of the intent understanding component. As an example, a pre-existing language understanding component may be invoked by passing the natural language text (and, in some cases, other information such as a key and a conversation identifier) to the component with a request to return intents representing the meaning(s) of the natural language text. Different keys and/or application identifiers submitted to the language understanding component may be used for different natural languages, thereby signaling to the language understanding component which language is being used. The language understanding component may include one or more known components for natural language understanding. In one example, the natural language understanding component may use multiclass classification, for example via a neural network with softmax output, multinomial logistic regression, naïve Bayes classification, and other machine learning techniques. More generally, examples of the language understanding component may utilize a lexicon of the natural language, as well as a parser and grammar rules to break each natural language phrase into a data representation of the phrase. The language understanding component may also utilize a semantic theory to guide comprehension, such as a theory based on naïve semantics, stochastic semantic analysis, and/or pragmatics to derive meaning from context. Also, the language understanding component may incorporate logical inference techniques such as by mapping a derived meaning into a set of assertions in predicate logic, and then using logical deduction to arrive at conclusions as to the meaning of the text.

Using results of such language understanding techniques, the language understanding component can map the resulting derived meanings to one or more intents, which can be sent to one or more components of the digital assistant (302) and/or to components elsewhere in the digital assistant engagement system (300). For example, such intents can be provided to a decision engine (304), which is part of the digital assistant (302) in the implementation illustrated in FIG. 3. As an example, the decision engine (304) may be running in an online service that is configured to communicate with client devices, such as the digital assistant service (240) discussed above with reference to FIG. 2.

Still referring to FIG. 3, the digital assistant engagement system (300) can include a task monitor (310) running in the smartphone (218) to provide information on the state of the device in which the task monitor (310) is running. The task monitor (310) is illustrated as not being part of the digital assistant (302). However, the task monitor (310) can generate and send task information (312) to the decision engine (304) in the digital assistant (302). For example, the task information (312) can be data indicating one or more states of the corresponding device (the smartphone (218), in this example), such as data that indicates what task is being performed in the smartphone (218) and/or data describing features of that task (e.g., content of a Web page being viewed). Such information can be collected by the task monitor (310) in one or more of various ways, such as using application programming interface calls exposed by an operating system and/or applications running on an operating system in the smartphone (218). Also, in some implementations, the task monitor (310) may be part of the digital assistant (302). For example, the task monitor may be included with a client-side application for the digital assistant (302). Such a configuration is illustrated for a task monitor (320) in FIG. 3, which is running in the intelligent speaker device (224). The task monitor (320) can operate similarly to the task monitor (310), generating and sending task information (322) to the decision engine (304). The task information (322) can include similar information to the task information (312) discussed above.

The digital assistant (302) can also include a proximity detector (330) running in the intelligent speaker device (224). Another proximity detector may be running in the smartphone (218) in addition to or instead of the proximity detector (330) in the intelligent speaker device (224). The proximity detector (330) can collect information on the device distance (332) between the device in which the proximity detector (330) is running and another device that is configured to work with the digital assistant (302) for proactive engagement with the digital assistant (302).

The proximity detector can operate in different ways. For example, proximity detection may be based on analyzing signals formatted according to protocols, such as with near-field communication, radio-frequency identification, and/or Bluetooth®. The proximity detector (330) can obtain data about signals formatted according to such protocols using application programming interfaces for components running in the intelligent speaker device. As one example, the strength of a Bluetooth® signal between two devices can be used as an indicator of proximity between the devices. FIG. 4 is a received signal strength indication (RSSI)-to-device distance chart (400) illustrating an example of how Bluetooth® signal strength between two devices changes as distance between the two devices changes. In the chart, the vertical axis is a received signal strength indication (RSSI) axis (410), indicating the Bluetooth® received signal strength indication. The horizontal axis is a time in seconds axis (420). Additionally, the chart (400) shows distance in meters (430) for blocks of time. The plotted signal strength line (440) in the chart indicates the strength of the signal as a function of time and as a function of device distance. The distances and other variables in the chart may not be exact; for example, it may have taken some time to move devices from one distance to another. However, the chart (400) illustrates an example of a measurable value, namely Bluetooth® received signal strength indication, that can be used as an indication of distance between two devices.

In using data such as Bluetooth® signal strength to estimate device distance, operations may be performed on the data to improve performance. For example, a smoothing operation may be performed on the signal strength values, such as using a moving average of the signal strength values to estimate distance. The resulting processed values can be monitored to provide an indication that an intent to engage with a digital assistant (302) may be present. For example, the received signal strength indication between two devices may be monitored to determine whether it rises above a threshold value, indicating proximity. Additionally, such proximity values may be monitored for other indicators of intent to engage. For example, the proximity values prior to the threshold being reached may be monitored. As one such indicator, a speed of approach (indicated by a rate of change of the signal strength, shown as the slope of the signal strength line on a chart such as in FIG. 4) may be analyzed for a correlation to previous approach speeds where there was an intent to engage with the digital assistant (302). Other features, such as the environment, room topology, time of day, and/or other features may also be used in conjunction with proximity to indicate an intent to engage with a digital assistant (302).
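As a minimal sketch of the smoothing, threshold, and approach-speed monitoring described above, the following Python example may be helpful. The sample values, the window size, the -60 dBm threshold, and all function names are illustrative assumptions and are not taken from FIG. 4 or from any particular implementation.

```python
from collections import deque


def moving_average(values, window=3):
    """Smooth RSSI samples with a simple trailing moving average."""
    buf = deque(maxlen=window)
    smoothed = []
    for v in values:
        buf.append(v)
        smoothed.append(sum(buf) / len(buf))
    return smoothed


def first_crossing(smoothed, threshold_dbm):
    """Index of the first smoothed sample at or above the threshold, or None."""
    for i, v in enumerate(smoothed):
        if v >= threshold_dbm:
            return i
    return None


def approach_rate(smoothed, index, sample_period_s=1.0, lookback=3):
    """Average change in smoothed RSSI per second over the samples before `index`."""
    start = max(0, index - lookback)
    if index == start:
        return 0.0
    return (smoothed[index] - smoothed[start]) / ((index - start) * sample_period_s)


# Hypothetical RSSI samples in dBm, one per second (less negative = stronger).
rssi = [-80, -78, -79, -70, -62, -55, -50, -49]
smoothed = moving_average(rssi, window=3)
idx = first_crossing(smoothed, threshold_dbm=-60)   # assumed proximity threshold
rate = approach_rate(smoothed, idx) if idx is not None else None
```

In this sketch, the slope returned by `approach_rate` stands in for the speed of approach; how such a slope would be correlated with previous approaches is left open, as it is in the discussion above.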

Referring to FIG. 3, the digital assistant engagement system (300) may include computer-readable user profiles (340), which can be stored and maintained, such as by the digital assistant service (240) and/or other service cooperating with the digital assistant service. The user profiles (340) can include user preferences (342), which can include preferences as to engagement. For example, such preferences may include explicit preferences (344) provided by user input, such as through a client device (such as through the smartphone (218) or the intelligent speaker device (224)). The user preferences (342) may also include implicit preferences (346) that can be inferred from reactive input (352) of the user profile (340) to the computer system in response to proactive engagement actions (350). For example, it may be that the user profile (340) always declines to accept proactive engagement between 5:00 PM (17:00) and 6:00 PM (18:00) because that is a time the corresponding user dedicates to conversing with his/her family and friends. From this, the digital assistant (302) can infer an implicit preference (346) to block any proactive engagement actions (350) during this time. For example, this inference may be achieved by tracking rates of accepting proactive engagement actions (350) by the user profile at particular times of day and detecting a significantly decreased acceptance rate during a particular time block. Similarly, the user profile (340) could provide an explicit instruction to block all proactive engagement actions (350) during that time block, such as by indicating that time block for a “do not disturb” setting in the user preferences (342).
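One way the acceptance-rate tracking described above might be sketched is shown below. The hourly time blocks, the `min_samples` and `drop_ratio` parameters, and the sample history are assumptions chosen for illustration; the simple ratio test here is a stand-in for whatever test of a "significantly decreased" rate an implementation would actually use.

```python
from collections import defaultdict


def infer_blocked_hours(history, min_samples=5, drop_ratio=0.5):
    """Return hours whose acceptance rate is well below the overall rate.

    `history` is an iterable of (hour_of_day, accepted) pairs, one per
    proactive engagement action and the reactive input that followed it.
    """
    per_hour = defaultdict(lambda: [0, 0])  # hour -> [accepted, total]
    for hour, accepted in history:
        per_hour[hour][0] += int(accepted)
        per_hour[hour][1] += 1
    total = sum(t for _, t in per_hour.values())
    if total == 0:
        return set()
    overall = sum(a for a, _ in per_hour.values()) / total
    blocked = set()
    for hour, (acc, tot) in per_hour.items():
        # Require enough samples before treating a low rate as a preference.
        if tot >= min_samples and acc / tot < overall * drop_ratio:
            blocked.add(hour)
    return blocked


# Hypothetical history: engagement is always declined in the 17:00 block.
history = (
    [(9, True)] * 8 + [(9, False)] * 2
    + [(17, False)] * 6
    + [(20, True)] * 5 + [(20, False)] * 1
)
blocked_hours = infer_blocked_hours(history)
```

The resulting set could then be consulted as an implicit preference when deciding whether a proactive engagement action is allowed at the current time.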

Referring still to FIG. 3, components of the decision engine (304) will now be discussed. In general, the decision engine (304) can operate to determine whether to take a proactive engagement action (350), to analyze types of computer-readable engagement options (358), and to identify engagement options (358) that meet corresponding qualifications (360) for the type of engagement option (358).

Specifically, the decision engine (304) can include an engagement qualifier (362), an output generator (364), an action predictor (366), and a content prefetcher (368). Each of these is discussed below.

The engagement qualifier (362) can determine whether a type of engagement option (358) meets the corresponding qualifications (360) for taking a corresponding proactive engagement action (350), such as a user engagement user interface output as illustrated with the proactive engagement action (350) in FIG. 3. This may include determining whether any engagement option (358) meets qualifications, and if so, it may also include determining which of the multiple available engagement options (358) best meets the qualifications (360). This may be done using the proximity information (334), and it may also include using other computer-readable information, such as the task information (312 and 322), user preferences (342), as well as external information that can be accessed by the engagement qualifier (362), such as time of day as indicated by a computer system clock. Other factors available from computer-readable data may also be considered in addition to and/or instead of these factors.

The qualifications (360) may be in one or more of various forms, such as rule-based qualifications and/or machine-learned qualifications. For example, the rule-based qualifications may have set threshold values of one or more types, such as a proximity value threshold, a proximity speed threshold, time ranges, etc. Rule-based qualifications may also include value comparisons, such as reading a value indicating a particular task is being performed on a device and comparing that value with values in the qualifications (360) for tasks that correspond to available engagement options (358). For example, the qualifications (360) can map a task being performed on a device with an engagement option (358) for starting an engagement that will invoke a particular “skill” or operation the digital assistant (302) is programmed to perform.

In one example, the digital assistant (302) may be programmed to provide natural language guidance for cooking a recipe through the intelligent speaker device (224). If the user profile is viewing a recipe on the display (220) of the smartphone (218), the smartphone (218) is in close proximity to the intelligent speaker device (224) (such as with a cross-device signal strength above a threshold value or other indications of the estimated distance being below a threshold value), and the time of day is a time that user preferences (342) indicate as a preferable time to cook, then all these can be considered as indicating the qualifications (360) are met for an engagement option (358) that involves a proactive engagement action (350) for initiating a computer communication session with the digital assistant to provide cooking guidance for the recipe. Thus, the proactive engagement action (350) may be an audible speech output on the intelligent speaker device (224) saying “Let's cook!” If the indications for an intent to engage in the cooking guidance session are not as strong (the device is not as close, it is not a typical time for cooking, and/or there is no recipe currently displayed on the smartphone (218), for example), then a different engagement option (358) for that same session may be selected as meeting the qualifications (360). For example, the audible output speech phrase may be a more tentative phrase, such as, “Do you want to cook?”
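The cooking example above can be sketched as a small set of rule-based qualifications. In the following Python example, the threshold values, the cooking time range, the task label `"viewing_recipe"`, and the class and function names are all hypothetical; the sketch only illustrates how stronger or weaker indications might select between the confident and the tentative engagement options.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class DeviceState:
    rssi_dbm: float      # smoothed cross-device signal strength
    current_task: str    # task reported by a task monitor
    now: time            # time of day from the system clock


@dataclass
class CookingRules:
    strong_rssi_dbm: float = -55.0   # assumed "close proximity" threshold
    weak_rssi_dbm: float = -70.0     # assumed minimum proximity to engage at all
    cook_start: time = time(17, 0)   # assumed preferred cooking window
    cook_end: time = time(20, 0)


def choose_cooking_prompt(state, rules=None, blocked=False):
    """Return the speech output for the qualifying option, or None."""
    rules = rules or CookingRules()
    if blocked or state.rssi_dbm < rules.weak_rssi_dbm:
        return None
    viewing = state.current_task == "viewing_recipe"
    close = state.rssi_dbm >= rules.strong_rssi_dbm
    cook_time = rules.cook_start <= state.now <= rules.cook_end
    if viewing and close and cook_time:
        return "Let's cook!"            # all indications are strong
    if viewing or cook_time:
        return "Do you want to cook?"   # weaker indications, tentative phrase
    return None


confident = choose_cooking_prompt(DeviceState(-50.0, "viewing_recipe", time(18, 0)))
tentative = choose_cooking_prompt(DeviceState(-65.0, "viewing_recipe", time(18, 0)))
```

The `blocked` flag is a placeholder for user preferences, such as a "do not disturb" time block, that would bar any output.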

In addition to or instead of rule-based qualifications, machine learning qualifications may be used. Such machine learning may include one or more models, such as deep neural networks, which can receive representations of one or more factors, such as those discussed above. For example, available data about the state of the digital assistant engagement system (300), such as the types of data discussed above, can be used to compute one or more vectors. Such vector(s) can be input into one or more machine learning models to form one or more output vectors. The resulting output of the model (such as the one or more vectors) can be compared to expected outputs for one or more different engagement options (358) (such as one or more expected vectors). The similarity of the compared vectors can be used as indicators of intent to engage with the digital assistant (302), thus increasing the likelihood of meeting the qualifications (360) for a corresponding engagement option (358). The model for processing the factors for current situations can be trained using standard training techniques, such as supervised or unsupervised training techniques. For example, backpropagation may be used to train deep neural networks that may be used for identifying whether qualifications (360) are met for one or more engagement options (358).

In one implementation, a vector can be computed for each of one or more engagement options (358). If the proximity information (334) reveals that the estimation of the device distance (332) (such as a signal strength value) indicates the distance is within a distance threshold (336), then a vector analysis of a current task being performed on the smartphone (218) and/or the intelligent speaker device (224) can be performed. For example, a vector can also be computed for a task being performed on the smartphone (218), such as by computing a vector using data for what application is being used and/or data representing particular content being presented in that application. The vector for the current task and the vector for the engagement option (358) may be compared. For example, an overlap can be computed between the extracted keywords and entities for the current task and the keywords and entities for the digital assistant's skill associated with the engagement option. As an example, the cosine of the angle between the two vectors (the vector from the task on the smartphone (218) and the vector computed from expected tasks for the engagement option (358) being analyzed) may be computed to provide a score that can be used in determining whether the engagement option (358) meets the qualifications (360) for the current state of the digital assistant engagement system (300). That score from the vector comparison may be used alone or with other scores, such as comparisons of other vectors for expected values with vectors computed from current values (e.g., values for an expected time of day).

The results of the vector comparisons can be combined if multiple such comparisons are performed (e.g., by performing a weighted addition or multiplication operation, or some other similar combining operation), which can produce an overall score representing an estimation of a likelihood that there is an intent for the engagement option (358) (i.e., whether the corresponding user or user profile intends to engage in the manner represented by the engagement option). Also, if initial qualifications are met for multiple engagement options (358) for an engagement opportunity, then a ranking technique may be used to rank the different engagement options (358). Such a ranking technique can compare scores for the different engagement options (358) using factors based on values, such as those discussed above. The highest scoring engagement option (358) can be considered to be the one engagement option that meets the qualifications (360), where such qualifications (360) can be considered to include a requirement for being the highest-ranking engagement option (358) for an opportunity.
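The vector comparison, weighted combination, and ranking just described might be sketched as follows. The keyword vocabulary, the time-of-day buckets, the weights, the minimum score, and the option names are all assumptions made for illustration; a weighted addition is used here as one of the combining operations mentioned above.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_options(current, options, weights, min_score):
    """Score each engagement option and return qualifying ones, best first."""
    ranked = []
    for name, expected in options.items():
        # One comparison per factor, combined by weighted addition.
        total = sum(
            weights[f] * cosine_similarity(current[f], expected[f]) for f in weights
        )
        if total >= min_score:
            ranked.append((total, name))
    ranked.sort(reverse=True)
    return [(name, total) for total, name in ranked]


# Hypothetical vocabulary for task vectors: [recipe, oven, flight, hotel].
# Hypothetical time-of-day buckets: [morning, afternoon, evening].
current = {"task": [1, 1, 0, 0], "time": [0, 0, 1]}
options = {
    "cooking_guidance": {"task": [1, 1, 0, 0], "time": [0, 0, 1]},
    "travel_planning": {"task": [0, 0, 1, 1], "time": [1, 0, 0]},
}
weights = {"task": 0.7, "time": 0.3}
ranked = rank_options(current, options, weights, min_score=0.5)
```

With these inputs only the cooking option clears the minimum score, so it would be treated as the highest-ranking engagement option for the opportunity.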

Other techniques may be used in addition to, or instead of, the techniques discussed above in determining whether, for a particular state of the digital assistant engagement system (300), the qualifications (360) are met for an engagement option (358).

If the engagement qualifier (362) determines that the qualifications (360) are met for an engagement option (358) that includes an output proactive engagement action (350), then the engagement qualifier can invoke the output generator (364) to generate the particular output, and take the proactive engagement action (350) by sending the output to one or more of the client devices, such as the smartphone (218) and/or the intelligent speaker device (224). For example, this may include generating a natural language dialog script and/or a user interface display, and sending such output to an application on the smartphone (218) or audibly speaking a natural language dialog script with the intelligent speaker device (224). Reactive input (352) may then be received from the user profile (340) via one of the devices (typically the device where the output for the proactive engagement action (350) is presented, but, in some cases, another device) to either confirm or reject the proactive engagement action (350). That confirmation or rejection may be analyzed and used in forming implicit preferences (346) for that user profile, which can then be used for future qualification determinations.

An action predictor (366) can perform similar determinations to those discussed above for the engagement qualifier. However, the action predictor (366) can determine whether qualifications (360) are met for engagement options (358) that involve taking actions now in anticipation of future actions. For example, if the user preferences (342) bar taking a proactive engagement action (350) in the form of an output action now, then the action predictor (366) can determine whether the qualifications (360) are met for taking preparatory actions now in anticipation of a future engagement. This determination of whether qualifications (360) are met can use techniques similar to those discussed above for the engagement qualifier (362). If the action predictor (366) determines that qualifications (360) are met for an engagement option (358), such as an option that involves prefetching content for use in a future action, then the action predictor (366) can invoke the content prefetcher (368) to prefetch content for use in the future action. For example, the content prefetcher (368) may retrieve online content such as Web pages, audio clips, video clips, or images that are commonly requested in a corresponding type of session with the digital assistant.
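A rough sketch of how an action predictor might hand work to a content prefetcher is given below. The class and function names, the returned labels, the placeholder URLs, and the injected `fetch` callable are hypothetical; a stand-in fetch function is used so that the example runs without network access.

```python
class ContentPrefetcher:
    """Caches content fetched ahead of a possible future session."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable: url -> content (injected, hypothetical)
        self.cache = {}

    def prefetch(self, urls):
        for url in urls:
            if url not in self.cache:   # avoid refetching cached content
                self.cache[url] = self._fetch(url)


def predict_action(option, qualifies_for_prep, output_allowed_now, prefetcher):
    """Decide what to do now for one engagement option."""
    if output_allowed_now:
        return "engage_now"             # output decisions belong to the qualifier
    if not qualifies_for_prep:
        return "none"
    prefetcher.prefetch(option["prefetch_urls"])
    return "prepared_for_later"


fetched = []


def fake_fetch(url):
    """Stand-in for a real retrieval of Web pages, audio, video, or images."""
    fetched.append(url)
    return f"<content of {url}>"


prefetcher = ContentPrefetcher(fake_fetch)
option = {
    "name": "cooking_guidance",
    "prefetch_urls": ["https://example.com/knife-skills", "https://example.com/roasting"],
}
result = predict_action(option, True, False, prefetcher)
predict_action(option, True, False, prefetcher)   # second call reuses the cache
```

Here the user preferences are assumed to bar an output action at the moment, so the sketch only prepares content for a future engagement.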

The operation of the digital assistant engagement system can include collecting, storing, transmitting, and otherwise handling information. In performing such operations, privacy should be considered and respected. For example, the digital assistant engagement system (300) may include opt-in or opt-out settings for user profiles to allow users to control how their information, and especially their personal information, is collected, stored, transmitted, and/or used in the digital assistant engagement system (300). Also, security measures such as data encryption and secure transmissions can be used to protect such information from being inadvertently exposed to third parties. Additionally, operations in the digital assistant engagement system (300) can be limited in accordance with appropriate privacy policies and user profile settings.

C. Example Using the Proximity-Based Digital Assistant Engagement System

A specific example of using the digital assistant engagement system (300) will now be discussed, though many other different examples may be implemented. In this example, a user can place a smartphone (218) next to an intelligent speaker device (224). The digital assistant (302) running on the intelligent speaker device (224) can recognize the smartphone (218), such as based on a unique identifier that pairs the smartphone (218) with the intelligent speaker device (224) (e.g., via a Bluetooth® identifier). Other ways of device recognition may be used instead of such Bluetooth® identification. The digital assistant (302) can regard placement of one device immediately next to another device (as determined from proximity information) as an implicit cue for engagement, in part because the user has done the same thing many times before and those prior engagements have been recorded as implicit preferences (346) for future use, with the implicit preferences being connected to that user's user profile (340) in the digital assistant engagement system (300).

In response to recognizing the proximity cue, the digital assistant (302) can read task information (312) from the smartphone (218) and/or task information (322) from the intelligent speaker device (224). The digital assistant (302) can confirm that the tasks are not tasks for which proactive engagement is blocked. For example, if one of the devices were in the process of finalizing a purchase, that could indicate that engagement was not intended at that time. Instead, the digital assistant (302) may determine that it can assist with the task being performed on the smartphone (218). For example, a recipe may be displayed on the smartphone display (220), and guidance in preparing recipes may be a known skill for the digital assistant (302). The digital assistant (302) can reach out via a proactive engagement action (350), such as a proactive voice request via the intelligent speaker device (224), offering to help prepare the recipe currently displayed on the smartphone display (220). In anticipation of a reactive input (352) accepting the requested help, the digital assistant (302) may proactively fetch resources that can help the user prepare items in the recipe (e.g., videos from an online video service). Upon user acceptance of the offer to help in a reactive input (352), the digital assistant (302) can display content on the smartphone (218), such as a current step in the recipe and additional resources, such as pre-fetched videos.

III. Proximity-Based Digital Assistant Engagement Techniques

Several proximity-based digital assistant engagement techniques will now be discussed. Each of these techniques can be performed in a computing environment. For example, each technique may be performed in a computer system that includes at least one processor and memory including instructions stored thereon that when executed by at least one processor cause at least one processor to perform the technique (memory stores instructions (e.g., object code), and when processor(s) execute(s) those instructions, processor(s) perform(s) the technique). Similarly, one or more computer-readable memories may have computer-executable instructions embodied thereon that, when executed by at least one processor, cause at least one processor to perform the technique. The techniques discussed below may be performed at least in part by hardware logic. Features discussed in each of the techniques below may be combined with each other in any combination not precluded by the discussion herein, including combining features from a technique discussed with reference to one figure in a technique discussed with reference to a different figure. Also, a computer system may include means for performing each of the acts discussed in the context of these techniques, in different combinations.

A. FIG. 5 Technique

Referring to FIG. 5, a proximity-based digital assistant engagement technique will be discussed. The technique can include detecting (510) a proximity of a first device with a second device. The first and second devices can each be a computerized device in a computer system. The first and second devices can each have a user interface component. Also, each device can be capable of operating independently of the other device. In response to the detecting (510) of the proximity, the technique can include determining (520) that a computer-readable qualification for a type of proactive engagement is met using the detected proximity and a non-proximity state of the first device. Note that a single detection of the proximity may be used to fulfill the proximity detection (510) and the use of the detected proximity in the determining (520) that the qualification is met, if the detected proximity is used in determining that requirements are met (such as determining that a proximity-indicating value is above or below a threshold value indicating sufficient proximity). For example, this qualification determination can include accessing and using computer-readable qualification guidelines (whether set guidelines in a rule-based technique or more variable guidelines in a machine learning technique). The type of proactive engagement can be one of multiple available computer-readable options for proactive engagements in the computer system, such as options for engagement via a digital assistant. In response to the determining (520) that the qualification for the type of proactive engagement is met, the technique can include generating (530) a user engagement user interface output of a type indicated by the type of proactive engagement. In response to the generating (530) of the output, the technique can include presenting (540) the generated output on the second device via the computerized natural language digital assistant operating in the computer system. The presenting (540) of the generated output can initiate a session with the digital assistant in the computer system. Following are some additional features that may be used with this FIG. 5 technique, either alone or in combination with each other.

The first device can be a mobile device, such as a laptop or smartphone, and the second device can be a stationary device, such as a desktop computer or a stationary intelligent speaker device.

The determining (520) that the qualification for the type of proactive engagement is met can use a state of the second device, such as a non-proximity state of the second device.

As an example, the state of the first device can include a task being performed by the first device and the state of the second device can include a task being performed by the second device.

The FIG. 5 technique may further include receiving a user input response to the generated output in the computer system, wherein the user input response confirms the type of proactive engagement.

The technique of FIG. 5 may further include, in response to the determining that the qualification for the type of proactive engagement is met, performing a preparatory action that prepares the computer system for the session in anticipation of a subsequent user input action confirming the type of proactive engagement. For example, such a preparatory action may be indicated as part of the type of proactive engagement, along with the generated output.

The technique of FIG. 5 may further include receiving the subsequent user input action confirming the type of proactive engagement, and using computer-readable data prepared by the preparatory action in conducting the session following the receiving of the user input action confirming the type of proactive engagement.

The state of the first device can include a computerized task being performed using the first device during the detection of the proximity. Also, the session can include using the digital assistant to enhance the task using the second device. For example, if the first device is displaying a recipe, the session with the digital assistant may include the digital assistant providing guidance on cooking the recipe. As an example, this may occur by providing audio or video instructions as guidance in preparing the recipe.

The determining (520) that the qualification for the type of proactive engagement is met can include comparing digital data being used for a task being performed using the first device with digital data matched with the multiple available computer-readable options for proactive engagements in the computer system. For example, this can include exploring the possibility that the digital data being used for the task from the first device matches each of multiple possible proactive engagement options, to determine whether any of them meet qualifications and possibly determining which option has the best ranking.

The session can be between a user profile and the digital assistant, and the determining (520) that the qualification for the type of proactive engagement is met can include accounting for preferences of the user profile that are stored in the computer system. The preferences of the user profile can include implicit preferences inferred by the computer system from past reactions of the user profile to user interface output actions requesting engagement with the digital assistant.

Where the session is a session between a user profile and the digital assistant, the determining that the qualification for the type of proactive engagement is met can include accounting for preferences of the user profile that are stored in the computer system. The preferences of the user profile can include explicit preferences, with the explicit preferences being stored in the computer system in response to user input explicitly requesting the explicit preferences.
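The two kinds of preferences might be combined as in the following sketch. The precedence of explicit over implicit preferences, the acceptance-rate heuristic, and the default thresholds are assumptions made for illustration; the description does not specify how the preferences are weighed.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UserProfilePreferences:
    # Explicit preferences: stored in response to direct user requests.
    explicit: Dict[str, bool] = field(default_factory=dict)
    # Past reactions per engagement type (True means the prompt was accepted).
    reactions: Dict[str, List[bool]] = field(default_factory=dict)

    def record_reaction(self, engagement_type: str, accepted: bool) -> None:
        self.reactions.setdefault(engagement_type, []).append(accepted)

    def allows(
        self,
        engagement_type: str,
        min_acceptance: float = 0.3,  # hypothetical threshold
        min_samples: int = 3,         # hypothetical minimum history length
    ) -> bool:
        # Assumption: an explicit preference always takes precedence.
        if engagement_type in self.explicit:
            return self.explicit[engagement_type]
        history = self.reactions.get(engagement_type, [])
        if len(history) < min_samples:
            return True  # too little evidence to infer a preference
        # Implicit preference: inferred from the past acceptance rate.
        return sum(history) / len(history) >= min_acceptance
```

Under these assumptions, a profile that has repeatedly dismissed a given type of prompt would stop receiving it, unless an explicit preference states otherwise.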

B. FIG. 6 Technique

Referring to FIG. 6, another proximity-based digital assistant engagement technique will be discussed. The technique of FIG. 6 can include detecting (610) a proximity of a first computing device with a second computing device. The first and second computing devices can each be a computerized device, with each device having a user interface component such as a microphone and speaker, or a touchscreen. In response to the detecting (610) of the proximity, it can be determined (620) that a computer-readable qualification for a type of proactive engagement is met using the detected proximity and a non-proximity state of the first device. The type of proactive engagement can be matched in the computer system with an operation that is programmed to be performed using a computerized natural language digital assistant. In response to the determining (620) that the qualification for the type of proactive engagement is met, the technique can include performing (630) a user engagement action via the second device. The user engagement action can be of a type indicated by the type of proactive engagement. The user engagement action can facilitate a computerized communication session between a computer-readable user profile and the computerized natural language digital assistant in the computer system. Following are some additional features that may be used with this FIG. 6 technique, either alone or in combination with each other.

The state of the first device can include a computerized task being performed using the first device during the detection of the proximity, and the session can include using the digital assistant to enhance the task using the second device.
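A minimal sketch of the determining (620) and performing (630) acts, using the detected proximity together with the task being performed on the first device, could look like the following. The distance-based proximity measure, the `max_distance_m` default, the task names, and the task-to-operation mapping are all hypothetical; the technique does not depend on any of them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeviceState:
    device_id: str
    current_task: Optional[str]  # non-proximity state, e.g. "viewing_recipe"


# Hypothetical mapping from a first-device task to an assistant operation.
TASK_TO_OPERATION = {
    "viewing_recipe": "offer_cooking_guidance",
    "playing_music": "offer_speaker_handoff",
}


def determine_engagement(
    distance_m: float, first: DeviceState, max_distance_m: float = 2.0
) -> Optional[str]:
    """Return a matched operation if proximity and device state both qualify."""
    if distance_m > max_distance_m:
        return None  # devices are not in proximity
    if first.current_task is None:
        return None  # no task on the first device to enhance
    return TASK_TO_OPERATION.get(first.current_task)


def perform_engagement(operation: str, second_device_id: str) -> str:
    """Stand-in for a user engagement action performed via the second device."""
    return f"[{second_device_id}] assistant: {operation}"
```

For instance, a phone displaying a recipe that comes within the assumed distance of a smart speaker would yield the cooking-guidance operation, which is then surfaced via the speaker.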

The user engagement action can be a user engagement user interface output via the second device, with the user engagement action initiating the session; and/or a preparatory action that prepares the computer system for the session in anticipation of a subsequent user input action confirming the type of proactive engagement.

Accordingly, the technique of FIG. 6 may further include receiving the subsequent user input action; responding to the subsequent user input action by performing a user engagement user interface output via the second device, with the user interface output being part of the session via the digital assistant; and conducting the session using computer-readable data prepared by the preparatory action.

The determining (620) that the qualification for the qualified type of proactive engagement is met can include selecting between at least the following types of available proactive engagements: a first type of proactive engagement for a first action that is a current user engagement user interface output via the second device, with the first action initiating the session; and a second type of proactive engagement for a second action that is a preparatory action, which prepares the computer system for the session in anticipation of a subsequent user input action indicating an intent to engage in the session. The determining (620) may include determining that either type of action, or both types of actions, meet the requirements and are to be performed. Thus, a single engagement option may correspond to a single engagement action or multiple engagement actions.
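The selection between the two types of actions, including the case where both are to be performed, can be pictured with the sketch below. The single `confidence` score and the two thresholds are assumptions introduced only to make the example concrete; the description does not state what requirements each type of action must meet.

```python
from enum import Flag, auto


class EngagementAction(Flag):
    NONE = 0
    CURRENT_OUTPUT = auto()  # user interface output that initiates the session
    PREPARATORY = auto()     # prepares the system ahead of a later user input


def select_actions(
    confidence: float,
    output_threshold: float = 0.8,   # hypothetical requirement for output
    prepare_threshold: float = 0.5,  # hypothetical requirement for preparing
) -> EngagementAction:
    """Select neither, one, or both types of engagement action."""
    actions = EngagementAction.NONE
    if confidence >= prepare_threshold:
        actions |= EngagementAction.PREPARATORY
    if confidence >= output_threshold:
        actions |= EngagementAction.CURRENT_OUTPUT
    return actions
```

Using a flag type lets a single engagement option carry one action or several, which mirrors the statement that a single option may correspond to multiple engagement actions.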

The determining (620) that the qualification for the type of proactive engagement is met can include comparing digital data being used for a task being performed using the first device with digital data matched with the multiple available computer-readable options for proactive engagements in the computer system.

The determining (620) that the qualification for the type of proactive engagement is met can include accounting for a time of day.
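As one possible reading, accounting for a time of day could mean restricting an engagement to a configured window. The window boundaries and function names below are hypothetical and serve only as a sketch.

```python
from datetime import time


def within_window(now: time, start: time, end: time) -> bool:
    """Return True if now falls in [start, end), allowing a midnight wrap."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end  # window spans midnight


def qualifies(
    proximity_detected: bool,
    now: time,
    start: time = time(7, 0),  # hypothetical earliest engagement time
    end: time = time(22, 0),   # hypothetical latest engagement time
) -> bool:
    """Qualification requires both proximity and an allowed time of day."""
    return proximity_detected and within_window(now, start, end)
```

With these assumed defaults, a proximity detected at 8:30 would qualify, while the same proximity detected at 23:00 would not.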

C. FIG. 7 Technique

Referring to FIG. 7, yet another proximity-based digital assistant engagement technique will be discussed. The technique can include detecting (710) a proximity of a first computing device with a second computing device in a computer system. The first and second devices can each have a user interface component. In response to the detecting (710) of the proximity, it can be determined (720) that a computer-readable qualification for a type of proactive engagement is met using the detected proximity. The type of proactive engagement can be one of multiple available computer-readable options for proactive engagements that are matched in the computer system with operations that are programmed to be performed using a computerized natural language digital assistant. The determining (720) that the qualification for the qualified type of proactive engagement is met can include selecting between at least the following types of available proactive engagements: a first type of proactive engagement for a first action that is a current user engagement user interface output via the second device, with the first action initiating a computerized communication session between a computer-readable user profile and the computerized natural language digital assistant in the computer system via the second device; and a second type of proactive engagement for a second action that is a preparatory action, which prepares the computer system for the session in anticipation of a subsequent user input action indicating an intent to engage in the session. The technique of FIG. 7 can further include, in response to the determining that the qualification for the qualified type of proactive engagement is met, performing (730) a user engagement action via the second device, with the user engagement action being of a type indicated by the qualified type of proactive engagement, and with the user engagement action facilitating the session.

Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

We claim:
 1. A computer system comprising: at least one processor; and memory comprising instructions stored thereon that when executed by at least one processor cause at least one processor to perform acts comprising: detecting via the computer system, a proximity of a first device with a second device, with the first device and the second device each being a computerized device, and with the first device having a first user interface component and the second device having a second user interface component; in response to the detecting of the proximity, determining, via the computer system, that a computer-readable qualification for a type of proactive engagement is met using the detected proximity and a non-proximity state of the first device, with the type of proactive engagement being one of multiple available computer-readable options for proactive engagements in the computer system; in response to the determining that the qualification for the type of proactive engagement is met, generating, via the computer system, a user engagement user interface output of a type indicated by the type of proactive engagement; and in response to the generating of the output, presenting the generated output on the second device via a computerized natural language digital assistant operating in the computer system, with the presenting of the generated output initiating a session with the digital assistant in the computer system.
 2. The computer system of claim 1, wherein the first device is a mobile device and the second device is a stationary device.
 3. The computer system of claim 1, wherein the determining that the qualification for the type of proactive engagement is met uses a state of the second device.
 4. The computer system of claim 3, wherein the state of the first device comprises a task being performed by the first device and the state of the second device comprises a task being performed by the second device.
 5. The computer system of claim 3, wherein the acts further comprise receiving a user input response to the generated output in the computer system, wherein the user input response confirms the type of proactive engagement.
 6. The computer system of claim 1, wherein the acts further comprise, in response to the determining that the qualification for the type of proactive engagement is met, performing a preparatory action that prepares the computer system for the session in anticipation of a subsequent user input action confirming the type of proactive engagement.
 7. The computer system of claim 6, wherein the acts further comprise: receiving the subsequent user input action confirming the type of proactive engagement; and using computer-readable data prepared by the preparatory action in conducting the session following the receiving of the user input action confirming the type of proactive engagement.
 8. The computer system of claim 1, wherein the state of the first device comprises a computerized task being performed using the first device during the detection of the proximity, and wherein session comprises using the digital assistant to enhance the task using the second device.
 9. The computer system of claim 1, wherein the determining that the qualification for the type of proactive engagement is met comprises comparing digital data being used for a task being performed using the first device with digital data matched with the multiple available computer-readable options for proactive engagements in the computer system.
 10. The computer system of claim 1, wherein the session is a session between a user profile and the digital assistant, and wherein the determining that the qualification for the type of proactive engagement is met comprises accounting for preferences of the user profile that are stored in the computer system, with the preferences of the user profile comprising implicit preferences inferred by the computer system from past reactions of the user profile to user interface output actions requesting engagement with the digital assistant.
 11. The computer system of claim 1, wherein the session is a session between a user profile and the digital assistant, and wherein the determining that the qualification for the type of proactive engagement is met comprises accounting for preferences of the user profile that are stored in the computer system, with the preferences of the user profile comprising explicit preferences, with the explicit preferences being stored in the computer system in response to user input explicitly requesting the explicit preferences.
 12. A computer-implemented method, comprising: detecting via a computer system, a proximity of a first computing device with a second computing device, with the first device and the second device each being a computerized device in the computer system, and with the first device having a first user interface component and the second device having a second user interface component; in response to the detecting of the proximity, determining, via the computer system, that a computer-readable qualification for a type of proactive engagement is met using the detected proximity and a non-proximity state of the first device, with the type of proactive engagement being matched in the computer system with an operation that is programmed to be performed using a computerized natural language digital assistant; and in response to the determining that the qualification for the type of proactive engagement is met, performing a user engagement action via the second device, with the user engagement action being of a type indicated by the type of proactive engagement, and with the user engagement action facilitating a computerized communication session between a computer-readable user profile and the computerized natural language digital assistant in the computer system.
 13. The method of claim 12, wherein the state of the first device comprises a computerized task being performed using the first device during the detection of the proximity, and wherein session comprises using the digital assistant to enhance the task using the second device.
 14. The method of claim 12, wherein the user engagement action comprising a user engagement user interface output via the second device, with the user engagement action initiating the session.
 15. The method of claim 12, wherein the user engagement action is preparatory action that prepares the computer system for the session in anticipation of a subsequent user input action confirming the type of proactive engagement.
 16. The method of claim 15, further comprising: receiving the subsequent user input action; responding to the subsequent user input action by performing a user engagement user interface output via the second device, with the user interface output being part of the session via the digital assistant; and conducting the session using computer-readable data prepared by the preparatory action.
 17. The method of claim 12, wherein the type of proactive engagement is a qualified type of proactive engagement, and wherein the determining that the qualification for the qualified type of proactive engagement is met comprises selecting between at least the following types of proactive engagements: a first type of proactive engagement for a first action that is a current user engagement user interface output via the second device, with the first action initiating the session; and a second type of proactive engagement for a second action that is a preparatory action, which prepares the computer system for the session in anticipation of a subsequent user input action indicating an intent to engage in the session.
 18. The method of claim 12, wherein the determining that the qualification for the type of proactive engagement is met comprises comparing digital data being used for a task being performed using the first device with digital data matched with the multiple available computer-readable options for proactive engagements in the computer system.
 19. The method of claim 12, wherein the determining that the qualification for the type of proactive engagement is met comprises accounting for a time of day.
 20. One or more computer-readable memory having computer-executable instructions embodied thereon that, when executed by at least one processor, cause at least one processor to perform acts comprising: detecting via a computer system, a proximity of a first computing device with a second computing device, with the first device and the second device each being a computerized device in the computer system, and with the first device having a first user interface component and the second device having a second user interface component; in response to the detecting of the proximity, determining, via the computer system, that a computer-readable qualification for a qualified type of proactive engagement is met using the detected proximity, with the qualified type of proactive engagement being one of multiple available computer-readable options for proactive engagements that are matched in the computer system with operations that are programmed to be performed using a computerized natural language digital assistant, and with the determining that the qualification for the qualified type of proactive engagement is met comprising selecting between at least the following types of proactive engagements: a first type of proactive engagement for a first action that is a current user engagement user interface output via the second device, with the first action initiating a computerized communication session between a computer-readable user profile and the computerized natural language digital assistant in the computer system via the second device; and a second type of proactive engagement for a second action that is a preparatory action, which prepares the computer system for the session in anticipation of a subsequent user input action indicating an intent to engage in the session; and in response to the determining that the qualification for the qualified type of proactive engagement is met, performing a user engagement action via the second device, with the user engagement action being of a type indicated by the qualified type of proactive engagement, and with the user engagement action facilitating the session.